AITopics

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Neural Information Processing Systems | Nov-20-2025, 23:05:56 GMT

Multiple Instance Learning for Efficient Sequential Data Classification on Resource-constrained Devices

We study the problem of fast and efficient classification of sequential data (such as time-series) on tiny devices, which is critical for various IoT-related applications like audio keyword detection or gesture detection. Such tasks are cast as a standard classification task by sliding windows over the data stream to construct data points. Deploying such classification modules on tiny devices is challenging as predictions over sliding windows of data need to be invoked continuously at a high frequency. Each such predictor instance is itself expensive, as it evaluates large models over long windows of data. In this paper, we address this challenge by exploiting the following two observations about classification tasks arising in typical IoT-related applications: (a) the signature of a particular class (e.g., an audio keyword) typically occupies a small fraction of the overall data, and (b) class signatures tend to be discernible early on in the data. We propose a method, EMI-RNN, that exploits these observations by using a multiple instance learning formulation along with an early prediction technique to learn a model that achieves better accuracy compared to baseline models, while simultaneously reducing computation by a large fraction. For instance, on a gesture detection benchmark [25], EMI-RNN improves the standard LSTM model's accuracy by up to 1% while requiring 72x less computation. This enables us to deploy such models for continuous real-time prediction on small devices such as Raspberry Pi0 and Arduino variants, a task that the baseline LSTM could not achieve. Finally, we also provide an analysis of our multiple instance learning algorithm in a simple setting and show that the proposed algorithm converges to the global optimum at a linear rate, one of the first such results in this domain.
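The two mechanisms the abstract mentions, sliding windows over a stream and stopping a prediction early once the class is discernible, can be sketched roughly as follows. This is a minimal illustration and not the EMI-RNN implementation: the multiple instance learning step is not modeled, and `sliding_windows`, `early_predict`, `toy_step`, and the confidence threshold are hypothetical names and values chosen for the example.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def sliding_windows(stream: Sequence[float], width: int, stride: int) -> List[Sequence[float]]:
    """Cut a stream into fixed-width windows, one window per stride step."""
    return [stream[i:i + width] for i in range(0, len(stream) - width + 1, stride)]


def early_predict(
    window: Sequence[float],
    step_fn: Callable[[Optional[int], float], Tuple[int, str, float]],
    threshold: float,
) -> Tuple[Optional[str], float, int]:
    """Feed the window one sample at a time and stop once confidence >= threshold.

    Returns (label, confidence, number_of_samples_consumed).
    """
    state: Optional[int] = None
    label: Optional[str] = None
    conf = 0.0
    for t, x in enumerate(window, start=1):
        state, label, conf = step_fn(state, x)
        if conf >= threshold:
            return label, conf, t  # early exit: the rest of the window is skipped
    return label, conf, len(window)


def toy_step(state: Optional[int], x: float) -> Tuple[int, str, float]:
    """Hypothetical stand-in for one recurrent step: counts loud samples."""
    count = (state or 0) + (1 if abs(x) > 0.5 else 0)
    label = "keyword" if count else "background"
    return count, label, min(1.0, count / 3)


stream = [0.0, 0.1, 0.9, 0.8, 0.7, 0.1, 0.0, 0.0]
windows = sliding_windows(stream, width=6, stride=2)
results = [early_predict(w, toy_step, threshold=1.0) for w in windows]
```

In this toy setup the second window starts on the loud samples, so the loop exits after three of its six samples; the saving in a real model would depend on how early the learned classifier becomes confident.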

efficient sequential data classification, name change, resource-constrained device, (11 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Trirat, Patara, Lee, Jae-Gil

MONAQ: Multi-Objective Neural Architecture Querying for Time-Series Analysis on Resource-Constrained Devices

arXiv.org Artificial Intelligence | Oct-9-2025

The growing use of smartphones and IoT devices necessitates efficient time-series analysis on resource-constrained hardware, which is critical for sensing applications such as human activity recognition and air quality prediction. Recent efforts in hardware-aware neural architecture search (NAS) automate architecture discovery for specific platforms; however, none focus on general time-series analysis with edge deployment. Leveraging the problem-solving and reasoning capabilities of large language models (LLMs), we propose MONAQ, a novel framework that reformulates NAS into Multi-Objective Neural Architecture Querying tasks. MONAQ is equipped with multimodal query generation for processing multimodal time-series inputs and hardware constraints, alongside an LLM agent-based multi-objective search to achieve deployment-ready models via code generation. By integrating numerical data, time-series images, and textual descriptions, MONAQ improves an LLM's understanding of time-series data. Experiments on fifteen datasets demonstrate that MONAQ-discovered models outperform both handcrafted models and NAS baselines while being more efficient.
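One common way to compare candidates in a multi-objective search is Pareto dominance. The abstract does not say that MONAQ selects models this way, so the sketch below is only an assumption about what "multi-objective" could look like in practice; the candidate names, accuracies, and latencies are invented for illustration.

```python
from typing import Dict, List

Candidate = Dict[str, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """a dominates b if it is no worse on both objectives and strictly better on one."""
    no_worse = a["accuracy"] >= b["accuracy"] and a["latency_ms"] <= b["latency_ms"]
    better = a["accuracy"] > b["accuracy"] or a["latency_ms"] < b["latency_ms"]
    return no_worse and better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only candidates that no other candidate dominates."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


# Hypothetical candidate architectures (all numbers are made up).
candidates = [
    {"name": "A", "accuracy": 0.90, "latency_ms": 40.0},
    {"name": "B", "accuracy": 0.85, "latency_ms": 10.0},
    {"name": "C", "accuracy": 0.84, "latency_ms": 30.0},  # worse than B on both
    {"name": "D", "accuracy": 0.90, "latency_ms": 50.0},  # same accuracy as A, slower
]
front_names = [c["name"] for c in pareto_front(candidates)]
```

A hardware constraint such as a latency budget could then be applied as a filter on the front, leaving the final choice among non-dominated models to the deployment target.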

constraint, large language model, machine learning, (19 more...)

2505.10607

Genre: Research Report (1.00)

Industry:

Health & Medicine > Diagnostic Medicine (0.93)
Information Technology (0.69)
Health & Medicine > Consumer Health (0.68)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Time Series Analysis (0.81)

Vu, Quynh Nguyen-Phuong, Martinez-Rau, Luciano Sebastian, Zhang, Yuxuan, Tran, Nho-Duc, Oelmann, Bengt, Magno, Michele, Bader, Sebastian

Efficient Continual Learning in Keyword Spotting using Binary Neural Networks

arXiv.org Artificial Intelligence | Aug-15-2025

Keyword spotting (KWS) is an essential function that enables interaction with ubiquitous smart devices. However, in resource-limited devices, KWS models are often static and thus cannot adapt to new scenarios, such as added keywords. To overcome this problem, we propose a Continual Learning (CL) approach for KWS built on Binary Neural Networks (BNNs). The framework leverages the reduced computation and memory requirements of BNNs while incorporating techniques that enable the seamless integration of new keywords over time. This study evaluates seven CL techniques on a 16-class use case, reporting an accuracy exceeding 95% for a single additional keyword and up to 86% for four additional classes. Sensitivity to the number of training samples in the CL phase and differences in computational complexity are also evaluated. These evaluations demonstrate that batch-based algorithms are more sensitive to the CL dataset size, and that differences between the computational complexities are insignificant. These findings highlight the potential of developing an effective and computationally efficient technique for continuously integrating new keywords in KWS applications that is compatible with resource-constrained devices.
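The reduced computation of BNNs comes largely from replacing multiply-accumulate operations on real values with bitwise operations on {-1, +1} values. The sketch below shows that general idea only; it is not taken from the paper, and `binarize`, `pack_bits`, and `binary_dot` are hypothetical helper names.

```python
from typing import List


def binarize(values: List[float]) -> List[int]:
    """Map real values to {-1, +1} with the sign function (0 maps to +1)."""
    return [1 if v >= 0 else -1 for v in values]


def pack_bits(signs: List[int]) -> int:
    """Pack a +/-1 vector into an integer: bit i is set where signs[i] == +1."""
    word = 0
    for i, s in enumerate(signs):
        if s == 1:
            word |= 1 << i
    return word


def binary_dot(a_bits: int, b_bits: int, n: int) -> int:
    """Dot product of two packed +/-1 vectors of length n via XOR and popcount.

    Matching positions contribute +1 and mismatches contribute -1, so the
    result is n - 2 * (number of mismatching bits).
    """
    mismatches = bin(a_bits ^ b_bits).count("1")
    return n - 2 * mismatches


weights = [0.3, -1.2, 0.0, 2.5, -0.1]
inputs = [-0.7, -0.4, 1.1, 0.9, 0.2]
bw, bx = binarize(weights), binarize(inputs)
fast = binary_dot(pack_bits(bw), pack_bits(bx), len(bw))
reference = sum(w * x for w, x in zip(bw, bx))
```

On a microcontroller the same computation would typically use a hardware popcount over machine words, which is where the memory and latency savings come from.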

algorithm, artificial intelligence, machine learning, (16 more...)

doi: 10.1109/SAS65169.2025.11105106

2505.02469

Country: Europe > Sweden (0.14)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Ahmed, Abdullah, Gummeson, Jeremy

Latent Sensor Fusion: Multimedia Learning of Physiological Signals for Resource-Constrained Devices

arXiv.org Artificial Intelligence | Jul-22-2025

Latent spaces offer an efficient and effective means of summarizing data while implicitly preserving meta-information through relational encoding. We leverage these meta-embeddings to develop a modality-agnostic, unified encoder. Our method employs sensor-latent fusion to analyze and correlate multimodal physiological signals. Using a compressed sensing approach with autoencoder-based latent space fusion, we address the computational challenges of biosignal analysis on resource-constrained devices. Experimental results show that our unified encoder is significantly faster, lighter, and more scalable than modality-specific alternatives, without compromising representational accuracy.

artificial intelligence, machine learning, natural language, (15 more...)

doi: 10.1145/3731715.3733482

2507.14185

Country: North America > United States > Massachusetts > Hampshire County > Amherst (0.15)

Genre: Research Report (0.84)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.47)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Neural Information Processing Systems | May-27-2025, 15:02:41 GMT

SILENCE: Protecting privacy in offloaded speech understanding on resource-constrained devices

Speech serves as a ubiquitous input interface for embedded mobile devices. Cloud-based solutions, while offering powerful speech understanding services, raise significant concerns regarding user privacy. To address this, disentanglement-based encoders have been proposed to remove sensitive information from speech signals without compromising the speech understanding functionality. However, these encoders demand high memory usage and computational complexity, making them impractical for resource-constrained wimpy devices. Our solution is based on a key observation that speech understanding hinges on long-term dependency knowledge of the entire utterance, in contrast to privacy-sensitive elements that are short-term dependent. Exploiting this observation, we propose SILENCE, a lightweight system that selectively obscures short-term details without damaging the long-term dependent speech understanding performance. The crucial part of SILENCE is a differential mask generator derived from interpretable learning to automatically configure the masking process. We have implemented SILENCE on the STM32H7 microcontroller and evaluated its efficacy under different attacking scenarios.

protecting privacy, resource-constrained device, silence, (1 more...)

Industry:

Information Technology > Security & Privacy (0.43)
Law > Civil Rights & Constitutional Law (0.40)

Technology: Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)

Neural Information Processing Systems | May-27-2025, 06:39:58 GMT

Thinking Forward: Memory-Efficient Federated Finetuning of Language Models

accuracy, memory-efficient federated finetuning, thinking forward, (11 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Feng, Chao, Huber, Nicolas, Celdran, Alberto Huertas, Bovet, Gerome, Stiller, Burkhard

Demo: A Practical Testbed for Decentralized Federated Learning on Physical Edge Devices

arXiv.org Artificial Intelligence | May-14-2025

Federated Learning (FL) enables collaborative model training without sharing raw data, preserving participant privacy. Decentralized FL (DFL) eliminates reliance on a central server, mitigating the single point of failure inherent in the traditional FL paradigm, while introducing deployment challenges on resource-constrained devices. To evaluate real-world applicability, this work designs and deploys a physical testbed using edge devices such as Raspberry Pi and Jetson Nano. The testbed is built upon a DFL training platform, NEBULA, and extends it with a power monitoring module to measure energy consumption during training. Experiments across multiple datasets show that model performance is influenced by the communication topology, with denser topologies leading to better outcomes in DFL settings.
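A rough intuition for why topology matters in DFL is that each node can only average with its direct neighbors, so sparse graphs spread information more slowly. The sketch below illustrates this with scalar "models" and synchronous neighbor averaging. It does not use the NEBULA platform or its API, and `ring`, `fully_connected`, `gossip_round`, and `spread` are hypothetical helpers.

```python
from typing import Dict, List

Topology = Dict[int, List[int]]


def ring(n: int) -> Topology:
    """Each node talks to its two ring neighbors (assumes n >= 3)."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


def fully_connected(n: int) -> Topology:
    """Each node talks to every other node."""
    return {i: [j for j in range(n) if j != i] for i in range(n)}


def gossip_round(params: List[float], neighbors: Topology) -> List[float]:
    """One synchronous round: each node averages itself with its neighbors."""
    updated = []
    for i, p in enumerate(params):
        group = [p] + [params[j] for j in neighbors[i]]
        updated.append(sum(group) / len(group))
    return updated


def spread(params: List[float]) -> float:
    """Disagreement between nodes: gap between largest and smallest value."""
    return max(params) - min(params)


start = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]  # one scalar "model" per node
after_ring = gossip_round(start, ring(len(start)))
after_full = gossip_round(start, fully_connected(len(start)))
```

After one round the fully connected graph has reached consensus while the ring still shows disagreement; real DFL training adds local gradient steps between rounds, so this is only an intuition for the aggregation step.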

artificial intelligence, federated learning, machine learning, (18 more...)

2505.08033

Country: Europe > Switzerland (0.15)

Genre: Research Report (0.40)

Industry:

Information Technology > Security & Privacy (0.47)
Information Technology > Hardware (0.39)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Hasan, Syed Mhamudul, Zangoti, Hussein, Anagnostopoulos, Iraklis, Shahid, Abdur R.

Sponge Attacks on Sensing AI: Energy-Latency Vulnerabilities and Defense via Model Pruning

arXiv.org Artificial Intelligence | May-13-2025

Recent studies have shown that sponge attacks can significantly increase the energy consumption and inference latency of deep neural networks (DNNs). However, prior work has focused primarily on computer vision and natural language processing tasks, overlooking the growing use of lightweight AI models in sensing-based applications on resource-constrained devices, such as those in Internet of Things (IoT) environments. These attacks pose serious threats of energy depletion and latency degradation in systems where limited battery capacity and real-time responsiveness are critical for reliable operation. This paper makes two key contributions. First, we present the first systematic exploration of energy-latency sponge attacks targeting sensing-based AI models. Using wearable sensing-based AI as a case study, we demonstrate that sponge attacks can substantially degrade performance by increasing energy consumption, leading to faster battery drain, and by prolonging inference latency. Second, to mitigate such attacks, we investigate model pruning, a widely adopted compression technique for resource-constrained AI, as a potential defense. Our experiments show that pruning-induced sparsity significantly improves model resilience against sponge poisoning. We also quantify the trade-offs between model efficiency and attack resilience, offering insights into the security implications of model compression in sensing-based AI systems deployed in IoT environments.
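The abstract names model pruning as the defense but not the pruning criterion. As a minimal sketch, assuming simple unstructured magnitude pruning (an assumption, not necessarily the authors' setup), pruning-induced sparsity can be produced like this; `magnitude_prune` and `sparsity_of` are hypothetical helpers and the weights are made up.

```python
from typing import List


def magnitude_prune(weights: List[float], sparsity: float) -> List[float]:
    """Zero out the given fraction of weights with the smallest magnitude."""
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0 and 1")
    k = int(len(weights) * sparsity)  # number of weights to remove
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def sparsity_of(weights: List[float]) -> float:
    """Fraction of weights that are exactly zero."""
    return sum(1 for w in weights if w == 0.0) / len(weights)


layer = [0.5, -0.05, 1.2, 0.01, -0.8, 0.3, -0.02, 0.9]
pruned = magnitude_prune(layer, sparsity=0.5)
```

Whether a pruned model actually resists sponge inputs depends on the hardware and the attack; this sketch only shows how the sparsity itself is created.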

artificial intelligence, machine learning, natural language, (17 more...)

2505.06454

Country: North America > United States > Illinois (0.14)

Genre: Research Report > New Finding (0.47)

Industry:

Information Technology > Security & Privacy (1.00)
Energy > Energy Storage (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

arXiv.org Artificial Intelligence | Mar-23-2025

SplitFrozen: Split Learning with Device-side Model Frozen for Fine-Tuning LLM on Heterogeneous Resource-Constrained Devices

Ma, Jian, Lyu, Xinchen, Jiang, Jun, Cui, Qimei, Yao, Haipeng, Tao, Xiaofeng

Fine-tuning large language models (LLMs) on private, on-device data can empower tailored personalized AI agents. However, fine-tuning LLMs on resource-constrained edge devices faces significant challenges, including excessive computation overhead, device heterogeneity, and data imbalance. This paper proposes SplitFrozen, a split learning framework that enables efficient LLM fine-tuning by strategically freezing device-side model layers while centralizing parameter-efficient fine-tuning on the server. Our framework partitions LLMs into device-side frozen layers and server-side fine-tuning layers, where heterogeneous resource-constrained devices execute only forward propagation. To minimize server-side training costs, we integrate Low-Rank Adaptation (LoRA) into the server-side layers. A pipeline parallelism strategy further optimizes training efficiency by decoupling device-server computations and leveraging decomposed backward propagation. Experiments on GPT-2 with the MRPC, MNLI-matched, and SST-2 datasets demonstrate that SplitFrozen outperforms FedLoRA and SplitLoRA by 69.4% in model accuracy under extremely imbalanced data, while reducing up to 86.8% of device-side computations and 50.2% of total training time. Experiments also validate the scalability of SplitFrozen on a content generation task using the Llama-3.2 model on the GSM8K dataset.
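The split described above, frozen forward-only layers on the device and LoRA-adapted layers on the server, can be sketched with tiny dense layers. This is an illustration under assumptions, not the SplitFrozen code: the layer shapes, the ReLU activation, the function names, and the 768-by-768 example dimension are all hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def matvec(m: Matrix, v: List[float]) -> List[float]:
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def device_forward(x: List[float], w_frozen: Matrix) -> List[float]:
    """Device side: forward pass only, through frozen weights, with ReLU."""
    return [max(0.0, h) for h in matvec(w_frozen, x)]


def server_forward(h: List[float], w_base: Matrix, lora_a: Matrix,
                   lora_b: Matrix, scale: float) -> List[float]:
    """Server side: base weights plus the low-rank update scale * (B @ A)."""
    delta = matmul(lora_b, lora_a)
    w_eff = [[w + scale * d for w, d in zip(w_row, d_row)]
             for w_row, d_row in zip(w_base, delta)]
    return matvec(w_eff, h)


def lora_param_count(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters in A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


w_frozen = [[1.0, -1.0], [0.5, 0.5]]          # stays fixed on the device
w_base = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # server-side base weights
lora_a = [[0.1, 0.2]]                          # rank-1 adapter, trainable
lora_b = [[0.0], [0.0], [0.0]]                 # zero init: no change at start

activations = device_forward([2.0, 1.0], w_frozen)  # sent from device to server
output = server_forward(activations, w_base, lora_a, lora_b, scale=2.0)
```

With `lora_b` initialized to zero the adapted layer reproduces the base layer exactly, and only the two small matrices need gradients. For a hypothetical 768-by-768 layer at rank 8, that is 12,288 trainable values instead of 589,824.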

large language model, machine learning, natural language, (17 more...)

2503.18986

Country: Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.82)

Industry: Information Technology (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)